Asynchronous MPI for the Masses
Abstract
We present a simple library which equips MPI implementations with truly asynchronous non-blocking point-to-point operations, and which is independent of the underlying communication infrastructure. It utilizes the MPI profiling interface (PMPI) and the MPI_THREAD_MULTIPLE thread compatibility level, and works with current versions of Intel MPI, Open MPI, MPICH2, MVAPICH2, Cray MPI, and IBM MPI. We show performance comparisons on a commodity InfiniBand cluster and two tier-1 systems in Germany, using low-level and application benchmarks. Issues of thread/process placement and the peculiarities of different MPI implementations are discussed in detail. We also identify the MPI libraries that already support asynchronous operations. Finally, we show how our ideas can be extended to MPI-IO.
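The abstract does not spell out the library's internals. As a rough illustration of what "truly asynchronous" progress means, the following toy model in plain Python (no MPI involved) hands a pretend transfer to a helper thread, so the caller can keep computing until it waits on the request. The names `ProgressThread`, `Request`, `isend`, and `slow_transfer` are hypothetical; they are not part of MPI or of the library described above, and the sleep merely stands in for a network transfer.

```python
import queue
import threading
import time


class Request:
    """Completion handle, loosely analogous to an MPI_Request (hypothetical)."""

    def __init__(self):
        self._done = threading.Event()

    def wait(self):
        # Block until the helper thread has finished the transfer.
        self._done.wait()

    def test(self):
        # Non-blocking completion check.
        return self._done.is_set()


class ProgressThread:
    """Helper thread that completes queued transfers in the background."""

    def __init__(self, transfer):
        self._transfer = transfer          # callable doing the actual "send"
        self._work = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def isend(self, payload):
        # Return immediately; the helper thread performs the transfer.
        req = Request()
        self._work.put((payload, req))
        return req

    def shutdown(self):
        self._work.put(None)               # sentinel: stop the helper thread
        self._thread.join()

    def _run(self):
        while True:
            item = self._work.get()
            if item is None:
                return
            payload, req = item
            self._transfer(payload)
            req._done.set()


delivered = []


def slow_transfer(payload):
    # Stand-in for a slow network transfer (assumption for illustration).
    time.sleep(0.05)
    delivered.append(payload)


progress = ProgressThread(slow_transfer)
req = progress.isend("halo data")          # returns at once
local_sum = sum(range(1000))               # computation overlaps the transfer
req.wait()                                 # transfer is guaranteed complete here
progress.shutdown()
```

In this model the transfer advances even though the caller never re-enters the communication layer between `isend` and `wait`, which is the property the abstract calls truly asynchronous.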
Similar Resources
Prospects for Truly Asynchronous Communication with Pure MPI and Hybrid MPI/OpenMP on Current Supercomputing Platforms
We investigate the ability of MPI implementations to perform truly asynchronous communication with nonblocking point-to-point calls on current highly parallel systems, including the Cray XT and XE series. For cases where no automatic overlap of communication with computation is available, we demonstrate several different ways of establishing explicitly asynchronous communication by variants of ...
A Practical Method to Implement Asynchronous Iterative Algorithms on MPI and a Case Study for Asynchronous Self-Organizing Maps
In this paper, an effective implementation scheme for asynchronous parallel iterative algorithms on message-passing systems using the MPI non-blocking communication model is proposed. The main idea of the method is to use the MPI_IPROBE function to check for the existence of pending messages without receiving them, thereby allowing us to write programs that interleave local computation with the proces...
Cone Beam Tomography using MPI on Heterogeneous Workstation Clusters - MPI Developer's Conference, 1996. Proceedings., Second
Cone beam CT requires long computation times and a large amount of memory. The majority of the computation is in the backprojection step. We have investigated the use of parallel methods on workstation clusters using MPI to both reduce the long processing time and lessen the memory requirements on individual workstations. We used the asynchronous MPI implementation to reconstruct a 25€i3 volu...
Adaptive multivariate integration using MPI
We describe a coarse-grained parallel algorithm for multivariate adaptive integration using MPI. The algorithm is asynchronous in nature and allows for load balancing. Timing results show good speedups obtained on a network of workstations for a class of integrals from Bayesian statistics.
Lock-Free Asynchronous Rendezvous Design for MPI Point-to-Point Communication
The Message Passing Interface (MPI) is the most commonly used method for programming distributed-memory systems. Most MPI implementations use a rendezvous protocol for transmitting large messages. One of the features desired in an MPI implementation is the ability to asynchronously progress the rendezvous protocol. This is important to provide potential for good computation and communication overlap...
Journal title:
- CoRR
Volume abs/1302.4280
Publication date 2013